Lost in Translation: Authorship Attribution using Frame Semantics

نویسندگان

  • Steffen Hedegaard
  • Jakob Grue Simonsen
چکیده

We investigate authorship attribution using classifiers based on frame semantics. The purpose is to discover whether adding semantic information to lexical and syntactic methods for authorship attribution will improve them, specifically to address the difficult problem of authorship attribution of translated texts. Our results suggest (i) that frame-based classifiers are usable for author attribution of both translated and untranslated texts; (ii) that framebased classifiers generally perform worse than the baseline classifiers for untranslated texts, but (iii) perform as well as, or superior to the baseline classifiers on translated texts; (iv) that—contrary to current belief—naïve classifiers based on lexical markers may perform tolerably on translated texts if the combination of author and translator is present in the training set of a classifier.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Translation and Hybridity in Scenes and Frames Semantics

 The present study is a theoretical attempt to illustrate how Fillmore's Scenes and Frames Semantics (SFS) could be employed as a framework to portray the process of understanding and translating hybrid texts. It first reviews the origin of SFS; then it maps SFS onto Nida’s linguistic model of translation process and the Interpretive Theory of Translation; it examines in the next section, withi...

متن کامل

Application of Frame Semantics to Teaching Seeing and Hearing Vocabulary to Iranian EFL Learners

A term in one language rarely has an absolute synonymous meaning in the same language; besides, it rarely has an equivalent meaning in an L2. English synonyms of seeing and hearing are particularly grammatically and semantically different. Frame semantics is a good tool for discovering differences between synonymous words in L2 and differences between supposed L1 and L2 equivalents. Vocabulary ...

متن کامل

Cross-Language Authorship Attribution

This paper presents a novel task of cross-language authorship attribution (CLAA), an extension of authorship attribution task to multilingual settings: given data labelled with authors in language X , the objective is to determine the author of a document written in language Y , where X 6= Y . We propose a number of cross-language stylometric features for the task of CLAA, such as those based o...

متن کامل

Deep Level Lexical Features for Cross-lingual Authorship Attribution

Crosslingual document classification aims to classify documents written in different languages that share a common genre, topic or author. Knowledge-based methods and others based on machine translation deliver state-of-the-art classification accuracy, however because of their reliance on external resources, poorly resourced languages present a challenge for these type of methods. In this paper...

متن کامل

Robust Word Sense Translation by EM Learning of Frame Semantics

We propose a robust method of automatically constructing a bilingual word sense dictionary from readily available monolingual ontologies by using estimation-maximization, without any annotated training data or manual tuning. We demonstrate our method on the English FrameNet and Chinese HowNet structures. Owing to the robustness of EM iterations in improving translation likelihoods, our word sen...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2011